What are the characteristics of the best bagels in NYC, at least according to Highley Varlet at https://everythingiseverything.nyc/?
(Thanks again to Mike @ https://highleyvarlet.com/ for sharing their bagel data!)
Out of 202 reviews, most took place in Brooklyn, Manhattan, and Queens.
In all categories, most of the reviews had scores of 3 to 4.5.
The distribution of scores was fairly similar between boroughs, though Manhattan had a higher concentration of reviews with higher Total scores. This seems to be driven by the overall higher Cheese scores and overall fewer middle-range Bagel scores in Manhattan.
The distribution of ratings was pretty uniform for most bagel characteristics. In other words, a good spread (ha ha) of different bagel types was covered in these reviews. The exception was bagel size: most of the reviewed bagels were larger as opposed to smaller.
Were bagel characteristics correlated? For example, did large bagels also tend to be more salty, or did bagels with dairy-forward cream cheese tend to be in more contemporary stores?
To answer this question, we looked at the correlation coefficient between every pair of characteristics. Correlations closer to 1 or -1 signify strong positive or negative (linear) relationships, respectively. Correlations closer to 0 signify no (linear) relationship.
Surprisingly, the characteristics were not correlated, for the most part. We did note a moderately strong relationship between topping density and salt levels (correlation = 0.4), which makes sense.
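The pairwise comparison described above can be sketched in a few lines. This is a minimal illustration, not the code used for the analysis: the helper names `pearson` and `pairwise_correlations` are hypothetical, and the ratings are made up rather than taken from the review data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(columns):
    """Return {(name_a, name_b): r} for every unordered pair of columns."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Made-up ratings for five hypothetical bagels (NOT the review data).
toy = {
    "c_toppingdense_light": [10, 40, 55, 70, 90],
    "c_highsalt_low": [20, 35, 60, 65, 95],
    "c_xlarge_small": [80, 30, 60, 20, 50],
}

for pair, r in pairwise_correlations(toy).items():
    print(pair, round(r, 2))
```

With 8 characteristics this yields 28 pairs, which is why a correlation matrix or heat map is the usual way to scan them all at once.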
We wanted to see if the information in the bagel characteristics (bagel size, salt level, etc.) was predictive of the overall score. To do this, we considered simple prediction models, starting with tried-and-true methods: linear regression, linear regression on principal components, and a random forest.
Note: Future work will involve more sophisticated regression and machine learning models that can accommodate non-linearity between the characteristics and the total score (as the random forest does). As the total score was not quite Gaussian but rather a score from 1 to 5, it may be of interest to consider parametric regression models with ordered and multinomial outcomes. Finally, we are very interested in insights that we can gain from the available spatial and imaging data, such as: do higher-scoring bagels tend to cluster geographically?
First, we consider a model where only the bagel characteristics are used to predict total score.
Dependent variable: Score_Total

| Predictor | Estimate | Std. Error |
|---|---|---|
| c_contemporary_classic | -0.003 | 0.005 |
| c_variety_focused | -0.007 | 0.005 |
| c_crackly_chewy | 0.006 | 0.008 |
| c_xlarge_small | 0.027*** | 0.008 |
| c_toppingdense_light | 0.005 | 0.006 |
| c_highsalt_low | 0.019*** | 0.006 |
| c_finescallion_coarse | -0.008 | 0.006 |
| c_dairyfrwrd_latent | -0.025*** | 0.006 |
| Constant | 3.638*** | 0.057 |

| Statistic | Value |
|---|---|
| Observations | 198 |
| R2 | 0.225 |
| Adjusted R2 | 0.192 |
| Residual Std. Error | 0.460 (df = 189) |
| F Statistic | 6.843*** (df = 8; 189) |

Note: *p<0.1; **p<0.05; ***p<0.01
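For readers who want to see what fitting such a model involves, here is a minimal sketch of ordinary least squares via the normal equations. It is an illustration under stated assumptions, not the code behind the table above: `solve` and `fit_ols` are hypothetical helper names, and the data are generated from a made-up formula with only two predictors.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data generated exactly from: score = 3.0 + 0.02 * size - 0.01 * dairy
rows = [(10, 50), (30, 20), (60, 80), (80, 10), (90, 60)]
y = [3.0 + 0.02 * size - 0.01 * dairy for size, dairy in rows]

print([round(b, 3) for b in fit_ols(rows, y)])
```

Because the toy scores contain no noise, the fit recovers the generating coefficients; with real review data the estimates come with standard errors like those reported in the table.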
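The most important predictors of total score were bagel size (c_xlarge_small, 0.027), salt level (c_highsalt_low, 0.019), and cream cheese dairyness (c_dairyfrwrd_latent, -0.025), the only coefficients significant at p<0.01.
Note: the overall R-squared was low (0.225), so this model explains less than a quarter of the variation in total score and does not predict it especially well.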
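As a quick check on how the fit statistics relate to each other, the adjusted R-squared and F statistic can be recomputed from the reported R-squared (0.225), the number of observations (198), and the number of predictors (8). The function names below are illustrative; the formulas are the standard ones for a linear model with an intercept.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (plus an intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F statistic with (p, n - p - 1) degrees of freedom."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


print(round(adjusted_r2(0.225, 198, 8), 3))  # 0.192
print(round(f_statistic(0.225, 198, 8), 2))  # 6.86
```

The adjusted value matches the table. The F statistic comes out slightly above the reported 6.843 because the R-squared used here is already rounded to three decimals.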
Next, we consider a model where the bagel characteristics and borough are used to predict total score.
Dependent variable: Score_Total

| Predictor | Estimate | Std. Error |
|---|---|---|
| c_contemporary_classic | -0.006 | 0.006 |
| c_variety_focused | -0.007 | 0.005 |
| c_crackly_chewy | 0.007 | 0.008 |
| c_xlarge_small | 0.029*** | 0.008 |
| c_toppingdense_light | 0.004 | 0.006 |
| c_highsalt_low | 0.020*** | 0.005 |
| c_finescallion_coarse | -0.008 | 0.007 |
| c_dairyfrwrd_latent | -0.023*** | 0.006 |
| BoroughBrooklyn | 0.275* | 0.150 |
| BoroughManhattan | 0.293* | 0.152 |
| BoroughQueens | 0.206 | 0.154 |
| BoroughStaten Island | 0.058 | 0.176 |
| Constant | 3.406*** | 0.145 |

| Statistic | Value |
|---|---|
| Observations | 198 |
| R2 | 0.252 |
| Adjusted R2 | 0.204 |
| Residual Std. Error | 0.457 (df = 185) |
| F Statistic | 5.207*** (df = 12; 185) |

Note: *p<0.1; **p<0.05; ***p<0.01
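The most important predictors of total score were again bagel size (c_xlarge_small, 0.029), salt level (c_highsalt_low, 0.020), and cream cheese dairyness (c_dairyfrwrd_latent, -0.023), all significant at p<0.01. The Brooklyn and Manhattan indicators were only marginally significant (p<0.1).
Note: the overall R-squared was again low (0.252, adjusted 0.204), so adding borough improves prediction only slightly.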
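Borough enters this model as a set of 0/1 indicator variables. Only four borough rows appear in the table, so one borough serves as the reference level absorbed into the Constant, and each borough coefficient is a difference from that reference. The sketch below assumes the reference is the Bronx (the level missing from the table); that is an inference, not something the output states, and `dummy_code` is a hypothetical helper name.

```python
BOROUGHS = ["Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island"]
REFERENCE = "Bronx"  # assumption: the borough with no row in the table


def dummy_code(borough):
    """Return 0/1 indicators for every non-reference borough."""
    if borough not in BOROUGHS:
        raise ValueError(f"unknown borough: {borough}")
    return {
        f"Borough{name}": int(name == borough)
        for name in BOROUGHS
        if name != REFERENCE
    }


print(dummy_code("Queens"))  # only BoroughQueens is 1
print(dummy_code("Bronx"))   # all indicators are 0 for the reference level
```

Under this coding, the BoroughManhattan estimate of 0.293 reads as "Manhattan bagels scored about 0.29 points higher than reference-borough bagels with the same characteristics."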
Although bagels were rated along 8 different characteristics, it is possible that they represent a smaller number of underlying, latent characteristics. That is, each bagel may be described by a few weighted combinations of the 8 characteristics. These combinations are called principal components (PCs):
##
## Loadings:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## c_contemporary_classic 0.218 0.196 0.862 0.280 0.235 0.142
## c_variety_focused 0.958 -0.156 -0.169 -0.125
## c_crackly_chewy -0.123 -0.116 -0.152 0.960 0.106
## c_xlarge_small -0.123 -0.271 0.948
## c_toppingdense_light -0.533 0.148 0.146 0.189 -0.791
## c_highsalt_low -0.803 0.179 0.539 0.102
## c_finescallion_coarse -0.510 0.789 -0.190 0.248
## c_dairyfrwrd_latent -0.404 0.761 0.446 0.205
##
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
## Proportion Var 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
## Cumulative Var 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
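To make the idea of a component concrete, here is a minimal sketch that finds the first principal component of a small made-up data set by power iteration on the sample covariance matrix. It is an illustration only: the helper names are hypothetical, the data are not the review data, and this is not the routine that produced the output above. One caution when reading that output: every loading vector has unit length, so its "SS loadings" row is 1 for each column and its "Proportion Var" row is simply 1/8 = 0.125; those rows do not report how much variance each component explains.

```python
from math import sqrt


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of equal-length rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def first_component(rows, iterations=500):
    """Return (loadings, variance, share of total variance) of the first PC."""
    cov = covariance_matrix(rows)
    k = len(cov)
    v = [1.0 / sqrt(k)] * k  # starting guess; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # renormalise to unit length each step
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(k) for j in range(k))
    total = sum(cov[i][i] for i in range(k))
    return v, variance, variance / total


# Made-up ratings on three characteristics for six hypothetical bagels.
toy = [(10, 20, 80), (40, 35, 30), (55, 60, 60), (70, 65, 20), (90, 95, 50), (25, 30, 70)]

loadings, variance, share = first_component(toy)
print([round(x, 3) for x in loadings], round(share, 3))
```

The share of total variance returned here is the quantity usually quoted when deciding how many components to keep.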
Instead of predicting the score using the 8 bagel features, we consider predicting the score using the first 5 principal components described above.
Dependent variable: Score_Total

| Predictor | Estimate | Std. Error |
|---|---|---|
| Comp.1 | -0.019*** | 0.004 |
| Comp.2 | -0.007 | 0.005 |
| Comp.3 | 0.012** | 0.005 |
| Comp.4 | -0.015** | 0.006 |
| Comp.5 | -0.026*** | 0.006 |
| Constant | 3.753*** | 0.033 |

| Statistic | Value |
|---|---|
| Observations | 198 |
| R2 | 0.198 |
| Adjusted R2 | 0.177 |
| Residual Std. Error | 0.465 (df = 192) |
| F Statistic | 9.501*** (df = 5; 192) |

Note: *p<0.1; **p<0.05; ***p<0.01
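The significance stars in these tables follow from the ratio of each estimate to its standard error. The sketch below recomputes them for the principal component model from the rounded values in the table. It assumes a normal approximation to the t distribution (reasonable with 192 residual degrees of freedom), and `stars` is a hypothetical helper, not part of the original analysis.

```python
from statistics import NormalDist


def stars(estimate, std_error):
    """Significance stars from a two-sided test, using a normal approximation."""
    z = estimate / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""


# (estimate, standard error) pairs copied from the table above.
pc_model = {
    "Comp.1": (-0.019, 0.004),
    "Comp.2": (-0.007, 0.005),
    "Comp.3": (0.012, 0.005),
    "Comp.4": (-0.015, 0.006),
    "Comp.5": (-0.026, 0.006),
}

for name, (estimate, std_error) in pc_model.items():
    print(name, round(estimate / std_error, 2), stars(estimate, std_error))
```

The recomputed stars agree with the table: Comp.1 and Comp.5 at the 1% level, Comp.3 and Comp.4 at the 5% level, and Comp.2 not significant.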
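The most important predictors of total score were Comp.1 (-0.019) and Comp.5 (-0.026), both significant at p<0.01, followed by Comp.3 (0.012) and Comp.4 (-0.015), both significant at p<0.05.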
Overall, these findings suggest that the cream cheese probably impacts the scores the most, and that more dairyness is often associated with lower scores.
These correlations also give insight into the strength and directions of relationships between bagel characteristics and the bagel score:
Correlations greater than 0 signify a positive relationship (e.g., the higher the salt, the higher the bagel score); correlations less than 0 signify a negative relationship (e.g., the more dairy-forward the cream cheese, the lower the cheese score).
The random forest is a machine learning algorithm that models a flexible, non-linear relationship between the predictors (bagel characteristics) and outcome (score) through an ensemble of regression trees.
To understand which predictors are the most important, we considered importance measured in %IncMSE, which quantifies the increase in prediction error (MSE) after randomly permuting that predictor's values, averaged over all trees within a random forest model. Higher values correspond to higher importance.
| Characteristic | %IncMSE |
|---|---|
| Contemporary Store (vs. Classic) | 13.23 |
| High Salt (vs. Lower) | 11.91 |
| Dairy-Forward Cream Cheese | 11.73 |
| Large Bagels (vs. Smaller) | 7.989 |
| Topping Dense (vs. Less Dense) | 5.434 |
| High Variety Store (vs. Focused) | 1.66 |
| Crackly Bagel (vs. Chewier) | 0.1327 |
| Fine Scallion Cream Cheese (vs. Coarser) | -2.047 |
We find that store characteristics, salt level, and cream cheese dairyness were again most important, followed by bagel size and topping density.
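The permutation idea behind this importance measure can be sketched independently of any particular model. The example below is a simplified, model-agnostic version: it shuffles one column at a time and measures the average increase in MSE on the same data, whereas the random forest measure works tree by tree on out-of-bag observations and rescales the result. The function names, the stand-in model, and the data are all made up for illustration.

```python
import random


def mse(predict, rows, y):
    """Mean squared error of a prediction function on (rows, y)."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)


def permutation_importance(predict, rows, y, column, repeats=20, seed=0):
    """Mean increase in MSE after shuffling one column, averaged over repeats."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, y)
    increases = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced.
        permuted = [
            r[:column] + (value,) + r[column + 1:]
            for r, value in zip(rows, shuffled)
        ]
        increases.append(mse(predict, permuted, y) - baseline)
    return sum(increases) / repeats


def toy_model(row):
    """Stand-in for a fitted model: uses column 0 (salt) and ignores column 1."""
    return 3.0 + 0.02 * row[0]


# Made-up (salt, scallion) ratings; scores follow the stand-in model exactly.
rows = [(10, 50), (30, 20), (60, 80), (80, 10), (90, 60), (45, 35)]
y = [toy_model(r) for r in rows]

print(permutation_importance(toy_model, rows, y, column=0))  # positive
print(permutation_importance(toy_model, rows, y, column=1))  # zero: column is ignored
```

A predictor the model does not rely on gets an importance near zero, and random variation can push such a value slightly below zero, which is one plausible reading of the negative entry for fine scallion cream cheese in the table above.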
Thanks again to Mike for providing the bagel review data, and to Nick Illenberger (NYU) and Sarah Weinstein (Penn) for their assistance in these analyses.